Add the heuristic of AddN op using reduce_sum op for parsing pb file (TF) #4251

kentqian · 2025-08-20T19:38:04Z

Motivation

Given a heuristic parsing solution for AddN op when trying to parse tf and support the concat op when having the mix shapes

Technical Details

Change the chain addition to reduce_sum op for parsing AddN op

If there is a mix of static and dynamic shapes, set everything to dynamic, then at the end, contract the shape back to static if possible. It also calculates the common non axis dims to bound the output. (Concat)

Test Plan

Add test cases in ref and tf/parse

test/tf/tests/addn_test.cpp
test/ref/add.cpp

Test Result

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

…rflow 2. Update the concat op's logic to handle mixed static and dynamic shapes more effectively. 3. Replaced the dynamic dimension handling in concat with compute_common_dyn_dims.

…IGraphX into kqian/addn_with_reduce

generated addn tf file regarding to many elements

…d to static

…ian/addn_with_reduce

migraphx-bot · 2025-08-29T06:22:42Z

Test	Batch	Rate new cd7e8a	Rate old 397919	Diff	Compare
torchvision-resnet50	64	3,160.32	3,245.86	-2.64%	✅
torchvision-resnet50_fp16	64	6,595.68	6,951.81	-5.12%	🔴
torchvision-densenet121	32	2,434.48	2,449.22	-0.60%	✅
torchvision-densenet121_fp16	32	4,100.40	4,167.34	-1.61%	✅
torchvision-inceptionv3	32	1,665.16	1,635.29	1.83%	✅
torchvision-inceptionv3_fp16	32	2,582.21	2,759.38	-6.42%	🔴
cadene-inceptionv4	16	794.19	770.72	3.05%	🔆
cadene-resnext64x4	16	802.60	817.99	-1.88%	✅
slim-mobilenet	64	8,210.79	7,456.32	10.12%	🔆
slim-nasnetalarge	64	221.72	210.95	5.11%	🔆
slim-resnet50v2	64	3,297.22	3,341.58	-1.33%	✅
bert-mrpc-onnx	8	1,131.58	1,144.86	-1.16%	✅
bert-mrpc-tf	1	480.89	445.07	8.05%	🔆
pytorch-examples-wlang-gru	1	297.10	299.79	-0.90%	✅
pytorch-examples-wlang-lstm	1	412.36	399.30	3.27%	🔆
torchvision-resnet50_1	1	798.59	761.18	4.92%	🔆
cadene-dpn92_1	1	411.51	384.27	7.09%	🔆
cadene-resnext101_1	1	360.78	391.94	-7.95%	🔴
onnx-taau-downsample	1	396.77	395.58	0.30%	✅
dlrm-criteoterabyte	1	31.90	33.78	-5.55%	🔴
dlrm-criteoterabyte_fp16	1	50.94	51.23	-0.56%	✅
agentmodel	1	8,718.02	9,034.65	-3.50%	🔴
unet_fp16	2	58.73	59.18	-0.77%	✅
resnet50v1_fp16	1	976.71	989.25	-1.27%	✅
resnet50v1_int8	1	970.28	1,022.00	-5.06%	🔴
bert_base_cased_fp16	64	1,109.36	1,106.73	0.24%	✅
bert_large_uncased_fp16	32	343.68	345.26	-0.46%	✅
bert_large_fp16	1	197.32	197.13	0.09%	✅
distilgpt2_fp16	16	2,096.42	2,115.80	-0.92%	✅
yolov5s	1	581.41	576.03	0.93%	✅
tinyllama	1	43.75	43.97	-0.49%	✅
vicuna-fastchat	1	45.06	45.28	-0.48%	✅
whisper-tiny-encoder	1	409.30	417.53	-1.97%	✅
whisper-tiny-decoder	1	411.18	408.53	0.65%	✅
llama2_7b	1	19.11	19.16	-0.26%	✅
qwen1.5-7b	1	23.44	23.51	-0.32%	✅
phi3-3.8b	1	26.53	26.67	-0.53%	✅
mask-rcnn	1	12.08	12.01	0.60%	✅
llama3-8b	1	21.65	21.72	-0.32%	✅
whisper-large-encoder	1	10.16	10.21	-0.48%	✅
whisper-large-decoder	1	96.84	95.77	1.12%	✅
mistral-7b	1	23.62	23.72	-0.41%	✅
FLUX.1-schnell	1	713.58	746.70	-4.44%	🔴
nan	nan	nan	nan	nan%	❌

This build is not recommended to merge 🔴

migraphx-bot · 2025-08-29T06:22:44Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

❌bert-mrpc-tf: ERROR - check error output

error: unknown warning option '-Wnrvo' [-Werror,-Wunknown-warning-option]

error: unknown warning option '-Wnrvo' [-Werror,-Wunknown-warning-option]

2025-08-29 00:03:16.115752: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 306, in main
graph = load_tf_graph(model_name)
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 300, in load_tf_graph
graph_def.ParseFromString(f.read())
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 116, in read
self._preread_check()
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/lib/io/file_io.py", line 77, in _preread_check
self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme '[local]' not implemented (file: '/new-saved-models/tf-misc/bert_mrpc1.pb')

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

✅ llama2_7b: PASSED: MIGraphX meets tolerance

✅ qwen1.5-7b: PASSED: MIGraphX meets tolerance

✅ phi3-3.8b: PASSED: MIGraphX meets tolerance

🔴mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output

✅ llama3-8b: PASSED: MIGraphX meets tolerance

✅ whisper-large-decoder: PASSED: MIGraphX meets tolerance

✅ mistral-7b: PASSED: MIGraphX meets tolerance

✅ FLUX.1-schnell: PASSED: MIGraphX meets tolerance

Copilot

Pull Request Overview

This PR adds a heuristic optimization for parsing TensorFlow's AddN operation by switching from chain addition to reduce_sum for cases with many input tensors (>=5). Additionally, it enhances the concat operation to handle mixed static and dynamic shapes by converting everything to dynamic and then contracting back to static when possible.

Implements a heuristic in AddN parsing that uses concat+reduce_sum+squeeze for 5+ inputs instead of chained addition
Updates concat operation to handle mixed static/dynamic shapes by unifying to dynamic then contracting to static when possible
Adds comprehensive test coverage for the new AddN heuristic with 10 input tensors

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
src/tf/parse_addn.cpp	Implements heuristic to use reduce_sum approach for AddN with 5+ inputs
src/include/migraphx/op/concat.hpp	Enhances concat to handle mixed static/dynamic shapes and compute common dimensions
test/tf/tests/addn_test.cpp	Adds test case for AddN with 10 elements using the new heuristic
test/tf/gen_tf_pb.py	Adds generator function for the new AddN test case
test/tf/models/addn_with_many_elements_test.pb	Binary protobuf model for testing AddN with many elements
test/ref/add.cpp	Adds reference test for the reduce_sum approach to AddN

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-09-02T23:15:03Z

src/tf/parse_addn.cpp

+        if(args.size() == 1)
+            return args[0];
+
+        if(args.size() < 5) // using heuristic when args exceed over 5 elements


The comment is inconsistent with the condition. The condition checks for 'less than 5' but the comment says 'when args exceed over 5 elements'. The comment should read 'using chain addition when args are less than 5 elements' or similar.

Suggested change

if(args.size() < 5) // using heuristic when args exceed over 5 elements

if(args.size() < 5) // using chain addition when args are less than 5 elements

Copilot · 2025-09-02T23:15:04Z

src/include/migraphx/op/concat.hpp

+        if(has_static and has_dynamic)
        {
-            // Dynamic input shapes
-            for(std::size_t index = 0; index < inputs[0].ndim(); index++)
+            for(auto& input : inputs)
            {
-                if(index != axis)
-                {
-                    if(not std::all_of(inputs.begin(), inputs.end(), [&](const shape& s) {
-                           return s.dyn_dims()[index] == inputs[0].dyn_dims()[index];
-                       }))
-                        MIGRAPHX_THROW("CONCAT: all input dimensions should match in axis " +
-                                       std::to_string(index));
-                }
-            }
-            std::size_t new_min = 0;
-            std::size_t new_max = 0;
-            for(const auto& input : inputs)
-            {
-                auto ddim = input.dyn_dims()[axis];
-                new_min += ddim.min;
-                new_max += ddim.max;
+                if(input.dynamic())
+                    continue;
+                input = input.to_dynamic();
            }
+        }


Modifying the inputs parameter directly can lead to unexpected side effects since it's passed by value but contains references to shapes. This modification affects the original shapes which may be used elsewhere. Consider creating a local copy of inputs or using a different approach to handle mixed shapes.

1. AddN use reduce_sum op for tensorflow

876fbfc

kentqian requested review from eddieliao and kahmed10 August 20, 2025 19:38

kentqian self-assigned this Aug 20, 2025

kentqian added enhancement New feature or request bugfix Fixes a bug found in the code. labels Aug 20, 2025

kentqian added 12 commits August 20, 2025 18:10

1. Use the heuristic with reduce_sum op when parsing AddN op in tenso…

c3e1b05

…rflow 2. Update the concat op's logic to handle mixed static and dynamic shapes more effectively. 3. Replaced the dynamic dimension handling in concat with compute_common_dyn_dims.

Merge branch 'kqian/addn_with_reduce' of https://github.com/ROCm/AMDM…

1b88398

…IGraphX into kqian/addn_with_reduce

added the reference test

1ece3f1

generated addn tf file regarding to many elements

transform instead of for loop

d612e20

tidy check

9d45b3f

Modified concat logic for when dynamic shapes are able to be converte…

6888e8a

…d to static

format

ace04a8

added the parse addn tf test

5f78453

format

d80f76a

Merge branch 'develop' of https://github.com/ROCm/AMDMIGraphX into kq…

7d89704

…ian/addn_with_reduce

tidy and license

5de1a06

format

cd7e8a6

kentqian marked this pull request as ready for review August 28, 2025 22:51

kentqian requested a review from causten as a code owner August 28, 2025 22:51

kentqian changed the title ~~AddN use the heuristic using reduce_sum op for tensorflow~~ Add the heuristic of AddN op using reduce_sum op for parsing pb file (TF) Aug 28, 2025

causten requested a review from Copilot September 2, 2025 23:14

Copilot AI reviewed Sep 2, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add the heuristic of AddN op using reduce_sum op for parsing pb file (TF) #4251

Add the heuristic of AddN op using reduce_sum op for parsing pb file (TF) #4251

kentqian commented Aug 20, 2025 •

edited

Loading

Uh oh!

migraphx-bot commented Aug 29, 2025

Uh oh!

migraphx-bot commented Aug 29, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Sep 2, 2025

Uh oh!

Copilot AI Sep 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	if(args.size() < 5) // using heuristic when args exceed over 5 elements
	if(args.size() < 5) // using chain addition when args are less than 5 elements

Add the heuristic of AddN op using reduce_sum op for parsing pb file (TF) #4251

Are you sure you want to change the base?

Add the heuristic of AddN op using reduce_sum op for parsing pb file (TF) #4251

Conversation

kentqian commented Aug 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

Uh oh!

migraphx-bot commented Aug 29, 2025

Uh oh!

migraphx-bot commented Aug 29, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Sep 2, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Sep 2, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kentqian commented Aug 20, 2025 •

edited

Loading